🧠 Inference Serving

Request Batching, Model Loading, Throughput Optimization, Latency Management

Jan Nano + Deepseek R1: Combining Remote Reasoning with Local Models using MCP
huggingface.co·5h·
Discuss: r/LocalLLaMA
📋MCP
How to use Gemini 2.5 to fine-tune video outputs on Vertex AI
cloud.google.com·22h
📊Feed Optimization
Slashing CI Costs at Uber
uber.com·4h·
Discuss: Hacker News
🛠️Build Optimization
DRIFT: Data Reduction via Informative Feature Transformation- Generalization Begins Before Deep Learning starts
arxiv.org·10h
📊Vector Databases
Build a Personalized AI Assistant with Postgres
supabase.com·7h
💾Prompt Caching
16 Changes to AI in the Enterprise: 2025 Edition | Andreessen Horowitz
a16z.com·5h
📊Model Serving Economics
Scaling Pinterest ML Infrastructure with Ray: From Training to End-to-End ML Pipelines
medium.com·22h·
Discuss: Hacker News
🕯️Candle
6 Key Security Risks in LLMs: A Platform Engineer’s Guide
thenewstack.io·19h
🕳LLM Vulnerabilities
The 20+ most common AI terms explained, simply
threadreaderapp.com·22h
🧠LLM Inference
Accelerating Provider MDM in Healthcare with Databricks and AI
databricks.com·15h
🦆DuckDB
Using an LLM for query planning in RAG –> 40% better answer relevance
techcommunity.microsoft.com·18h·
Discuss: Hacker News
🔄LLM RAG Pipelines
Speaker ID, Database Timeouts & Content Hashing
askthegame.bearblog.dev·14h
💾Prompt Caching
Why Agentic Flows Need Distributed-Systems Discipline
temporal.io·21h·
Discuss: Hacker News
Distributed systems
Flynn Was Right: How a 2003 Warning Foretold Today’s Architectural Pivot
semiwiki.com·21h
⚡Hardware Acceleration
Introducing Northguard and Xinfra: Scalable log storage at Lin...
linkedin.com·14h
📂LiteFS
Alleviating User-Sensitive bias with Fair Generative Sequential Recommendation Model
arxiv.org·10h
🎛️Feed Filtering
What LLMs Know About Their Users
schneier.com·3h
🪄Prompt Engineering
An Analogy for Interpretability
lesswrong.com·23h
AI Interpretability
Black-Box Test Code Fault Localization Driven by Large Language Models and Execution Estimation
arxiv.org·10h
🕯️Candle
TAI #158: The Great Acceleration: AI Revenue, M&A, and Talent Wars Erupt as the Industry Matures
pub.towardsai.net·21h
🆕New AI